Training of Convolutional Networks on Multiple Heterogeneous Datasets for Street Scene Semantic Segmentation
نویسندگان
چکیده
We propose a convolutional network with hierarchical classifiers for per-pixel semantic segmentation, which is able to be trained on multiple, heterogeneous datasets and exploit their semantic hierarchy. Our network is the first to be simultaneously trained on three different datasets from the intelligent vehicles domain, i.e. Cityscapes, GTSDB and Mapillary Vistas, and is able to handle different semantic levelof-detail, class imbalances, and different annotation types, i.e. dense per-pixel and sparse bounding-box labels. We assess our hierarchical approach, by comparing against flat, nonhierarchical classifiers and we show improvements in mean pixel accuracy of 13.0% for Cityscapes classes and 2.4% for Vistas classes and 32.3% for GTSDB classes. Our implementation achieves inference rates of 17 fps at a resolution of 520 x 706 for 108 classes running on a GPU.
منابع مشابه
Pixel-Level Encoding and Depth Layering for Instance-Level Semantic Labeling
Recent approaches for instance-aware semantic labeling have augmented convolutional neural networks (CNNs) with complex multitask architectures or computationally expensive graphical models. We present a method that leverages a fully convolutional network (FCN) to predict semantic labels, depth and an instance-based encoding using each pixel’s direction towards its corresponding instance center...
متن کاملMonocular Depth Estimation by Learning from Heterogeneous Datasets
Depth estimation provides essential information to perform autonomous driving and driver assistance. Especially, Monocular Depth Estimation is interesting from a practical point of view, since using a single camera is cheaper than many other options and avoids the need for continuous calibration strategies as required by stereo-vision approaches. State-of-theart methods for Monocular Depth Esti...
متن کاملSemantic Segmentation of Earth Observation Data Using Multimodal and Multi-scale Deep Networks
This work investigates the use of deep fully convolutional neural networks (DFCNN) for pixel-wise scene labeling of Earth Observation images. Especially, we train a variant of the SegNet architecture on remote sensing data over an urban area and study different strategies for performing accurate semantic segmentation. Our contributions are the following: 1) we transfer efficiently a DFCNN from ...
متن کاملIntegration of Deep Learning Algorithms and Bilateral Filters with the Purpose of Building Extraction from Mono Optical Aerial Imagery
The problem of extracting the building from mono optical aerial imagery with high spatial resolution is always considered as an important challenge to prepare the maps. The goal of the current research is to take advantage of the semantic segmentation of mono optical aerial imagery to extract the building which is realized based on the combination of deep convolutional neural networks (DCNN) an...
متن کاملTraining Constrained Deconvolutional Networks for Road Scene Semantic Segmentation
In this work we investigate the problem of road scene semantic segmentation using Deconvolutional Networks (DNs). Several constraints limit the practical performance of DNs in this context: firstly, the paucity of existing pixelwise labelled training data, and secondly, the memory constraints of embedded hardware, which rule out the practical use of state-of-the-art DN architectures such as ful...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2018